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1. INTROBTJCTIOrJ 

Many fundarnenLal algorithms in numerical analysis include the 
evaluation of A'^-A or aJ-D-A, where k is a real m'XTi matrix and il is a 
diagonal mxm matrix. Examples are given in papers on factorization of 
matrices or problems of minimization in vrhich the Hessian has this form 
( Gay [1] Gonen & Avriel [3] ). The extended use of this product motivates 
the question of reducing its complexity. 



This research ras partially supported hy the NrS Foundation Research Program. 



The purposes of Lfiis paper are: 



1. To relate the computational complexity of A'^-D-A to the sparsity rate 
of the matrix A . 

2. For a given sparsity rate, to distinguish between the vmrst and the 
best case. 

3. To provide an application of these results. 

The problem of multiplying a transpose of a sparse matrix by itself 
was discussed in several books and papers e.g. George & Liu [2] in which 
they include the number of operations required for this multiplication. 
Gustavson [4] proposed an optimal algorithm for multiplying two sparse 
matrices A-3 v/here A£R^^^ and proving that the number ox 

multiplication N satisfies Q-< N -<rLrnk . However, the connection between 
the number of operation and the sparsity rate of the matrices was not 
discussed. 

Apparently, it seems that this question has only theoretical meaning 
since the matrix A is provided and therefore the number of operations is 
known. However, in this paper we will see there exist some cases in winch 
the configuration of this matrix A can be designed by the user. In these 
cases it make sense to analyze this product in order to reduce the 
number of operations. 

In section 2 of this paper, we present the computational complexity 
of A'^-D-A for several sparsity patterns of A. In this section, we establish 
our results on the assumption that the number of nonzero elements of 
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the matrix A is provided. We demonstrate the best and the ;vorst case, 
showing that in the best case, the nonzero elements are divided homo- 
geneously among the rows of A, while in the worst case, these nonzero 
elements are confined in a limited number of rows. 

In section 3 vre provide an example from optimization theory, in 
which the matrix A is dense and by appl>lng the results of section 2 we 
minimdze the number of multiplication in the evaluation of the Hessian. 

In this paper, all vector spaces are finite dimensional and vectors are 
colunan vectors. The space of all nxrr. matrices is denoted by ; the 

nonnegative orthant of the Euclidean space i?" is denoted by ; the 
subset of all integer vectors in is denoted by /", and its nonnegative 
orthant by 7’^. For a matrix .4 we denote by c;. and a<^ the i.-th row and 
the ;-th column respectively. The transpose of A is denoted by .4^. By 
the norm ] |.t| 1 we mean the Euclidean norm. For a real number r its 
integer part is denoted by [rj. Finally, the number of elements in the set 
B is denoted hy \B\ , and the number of zero elements in a matrix h is 
denoted by Z{A.). 

2. THE COfiPUTATICNAL COHPLSXIT/ 07 A'^'D- A. 

N 

Let A be in with N nonzero elements. The ratio is called 

TYVn 

the sparsity rate of the matrix A and denoted by cr(A). In this section vre 
assume that the sparsity rate of the matrix is provided and that 



each row of .4 includes, at least, one nonzero element. We concentrate on 



the sparsity pattern of A, looking for the best and the worst cases, by 
means of the number of operations required to compute A'^-D-A where D 
is a diagonal matrix Tfe begin our exploration in the worst case, 

in which the configuration of A implies the maximum number of multipli- 
cation. Let us denote by rru the number of nonzero elements in , thus 



i = l 



N. 



Our first Lemma provides us the number of op 
tions) required to accomplish the product ^^-.4. 



(2.1) 



rations (miulticlice.' 



Lemma 2.1; Let be a given sparse matrix, then the product A’^-A 

can be computed using 

1) (2.2) 

i = l 

multiplications. 

Proof: The product 4^-.4 can be revvTitten as a sumi of rz rank 1 matrices 

a" A = (2.3) 

1=1 

Tnc rank 1 matrices (ii» a7, arc symmetric. Each nonzero clcrncnt j of 
the vector a,;, is multiphed by all ether nonzero elements Oj.;. for . 
Tnerefore, the number of multiplication is 
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coinbirdng (2.3) vrith (2.4) yields Lhe proof of the lemma. 



From the proof above, it can easily be seen that the number of addi- 
tions are approximately the same as the number of multiplications since 
each term is accumulated into Lhe result matrix C; C=A'^ A. 



Ccrcllary 2.1; Let be a sparse matrix and a diagonal 

matrix then the product A'^-D-A can be com.puted by 

^/z'£rr^-{Tru-i- i.) N (2.5) 

i = l 

multiplications. 

Prcci: We first compute A = D- A v/hich requires N multiplications and 
then substituting by a/» in (2.3) yields the proof of the corollary. 

In order to find the sparsity pattern vrhich yields the vmrst case, >ve 
have to mammize (2.2) provided (2.1) and aU are positive integers. 
Since the difference betvfeen (2.2) and (2.5) is N, it is enough to explore 
the worst case for the product A'^-A that will yield the sam.e result for 
A’^ D-A. Consequently, a new problem, can be formulated as foUo'v. 



(Al) 



^ 77^■(Tr^ + 1) 

max 5 

i = l 



subject to the constraint 



£rrK:=jV 

i = l 



( 2 . 6 ) 
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and 



1< 771^ ; rTijC/* 



(2.7) 



m 



This problem can be reduced to maximizing under the same con- 

i=l 

straints. Denning x^-rrii—l yields the following problem 



(A2) max||a;||^ (2.8) 

subject to the constraint 

m 

2-i=vV-m (2.9) 

i=l 



and 



0<ri <71 - 1 , 



( 2 . 10 ) 



Vfs mil prove that since the objective function is convex its maximum is 
attained at a boundary point. An integer vector x^F’ is called a boundary 
point of problem (A2) if there exists a set J = \j and a 
unique joCL—J such that 



71 — 1 



ie/ 



Xi = 



(A — 77l) -T?(n - 1) 
0 



i =;o 
other^uiss 



(2.11) 



where 



13 = 



N -m 

71-1 



( 2 . 12 ) 



In this case the vector x=x+s where e = (l,...,l) is a boundary point of 
problem (Al). 



r-i 

- i - 

Fortunately, from the symmetric property of the objective function, 
the optimal value does not depend on the selected boundary point. 
Hence 

m 

= '6-{n - 1)^ + [(// -m) -'d (n-l)]^, (2. 13) 

i = l 

To prove that (2.11) is the solution of problem (A2) vre need the fol- 
lovring lemma 



Lemma 2.2 : Consider the integer problem 



(A3) 



max I 1 T I 



(2.14) 



subject to the constraints 






i=l 






(2.15) 



(2.13) 



where K and M are positive integers, FA<K. Problem (A3) has a solution 
x' satisfying 



la:'; \ ^ = 



(2.1'^) 



where t? is the integer part of ^denoted by 



, if and oni}'’ if 



n-M>K. 



(2. IS) 



Moreover, if (2.18) holds then every solution of problem (A3) satisfies 
(2.17). 
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Prcox: It is immediate that if n-I,l<K then there is no feasible solution 
to problem (A3). Therefore, let us assume that (2.18) holds and prove 
this lemma by induction on the dimension of 2 . If n = 1, then from (2.15) 
we have x = K ^ M . If K < M then = 0 and if K = M then x? = 1 . In both 
cases (2.17) is satisfied. Assuming the assertion is true for all n. n<Tn-l. 
Let us denote 




hence 




Since by the induction assumption, (2.17) holds for r/i-1 




K 



I ( 2 . 21 ) 



M 



L <o> t .'W wH r P — I 7 Y] f) <1 n <1 "PH P 1 T) H P T* *0 f H “i fii 

[../j 

K by M ) Consider the maximization problem (2.21) in two cases; 



1. 0<x^<p. 



In this case t? = t 5 and the problem is 




Substituting K-t3M by p , yields the maximization of 



-r p^- — 2px^ 4- cx^ 



(2.23) 



subject to the constraint 0<x^<p and x^ell . The maximum of (2.23) is 
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aLLained aL =p or - 0, and 



I I X I 1 2 = + p2 = t5^/2 + (K-iJIiy. 



(2.24) 



2 . p<Xf^<M . 



In this case tJ = i5 - 1 and the problem is 



max J (t? -1)M^ + (p + M)^ - 2{p -i- U )xj^ + 2x^| 




using the same arguments as in the first case, the maximum is attained 
atx^=iy, thus 



Appljnng Lemma 2.2 to problems (Al) and (A2) 3 /ields the follovring 
conclusion. 

Corollary 2.3: Every xeAJ satisfying (2. 1 1 ) is a solution to problem (A2). 
Proof: Suppose xe/!J satisfies (2.11), which mean that (2.13) holds. Sub- 

stituting i/=n-l and K = N-ra in Lemma 2.2 implies that (2.17) and 
(2.13) arc the same, and T.emma 2.2 implies that x is a solution of prob- 
lem (A2). 




In both cases (2.1?) holds, which complete our proof. 



a 



a 



A solution to problem (Al) can be established by setting 



- 10 - 



rrii, - 



n 



N - 771 — (n — 1) 4- 1 
1 



iej 
'i- =;'o 
othsTiuise 



(2.27) 



where satisfies (2.12) and / is a set of indices 1/| ='il and jc is an index 
not in J. The computational complexity of the product /J-A for the worst 
case is established by substituting (2.27) into (2.6) 

A'-u.:: = + (jV — 771, -lj(7l— 1)+ 1)^4- 772, -ij - 1 -r/.'']. (2.28) 



It can be seen that in the worst case some of the Tn.,.-!!! achieve the 
upper bound n , the others are zero and only one of the iTOi-th is scme- 
v/here between 0 and n. This mean that the matrix A has as many full 
rovfs as possible, the rest of the rows have one element, and one row con- 
tains the remairiing nonzero elements of N. 



In the next Lennma a new bound for the computational complexity is 
presented which enable us to relate the sparsity rate and the mathemati- 
cal effort. 



Lemma 2.4 The computational complexity of the product h can be 
bounded by 

/htc + 2/V). ' (2.29) 

Proof: Let us denote by 

cp{k ) = {k -n^ -r [(iV -771) - .t- • (77 -1) + 1]- + 771 -.1: - 1 + -V). (2.30) 



The first assertion is that 
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N — m 



n - 1 



\ ^ / N - m. 
n — I 



(2.31) 



If vfe denote by f = 



N 


— m 


N- 


-77X 


n 


- 1 


n * 


-1 



, then 0<f<l . A straightfonrard 



calculation yields that 



cp{ ^ _J^) = {N —m)-(n + 1) -r 771 + jV, 



OC j 



Hence 






r* 

Oo ^ 



,N-7n 2 j. , r Ar A' -ttj , . a , ^ i? , N -m , ^ , , -r\ 

~ ( -»f7^-^[iV-77i - -- _ ^ (ti -l)+<-(n -1) + 1]'-:-?7 i - -- -!-:-■ /) = 

= {iV -771 ) • (n + 1 ) -5-771 -i- iV -[ ^^-=^71 2 - 1) + 771 + .V - 1) -i- 1 )^ + f- 1 ] = 

= <rn“-{(f(n -1) + l)2-,f + l = ^.(l-f) (n -1)2 

since 0 < f < 1 the last expression is nonnegative vV'hich prove oijr first 
assertion. The rest of the proof is established by the follovring: 



Mu;c =]4^( 



N -m 



n - 1 



) < }^( = ^71 (N - 771) + 2N]. (2.34) 



As Tv'c can see, (2.29) provides us an elegant bound for the eomputa- 
tional complexity of the worst case. This bound is a good approximation 

AT '* 77 / 

to the computational complexity when ^ is close to its integer part. 

The difference betvreen the mathematical effort of computing A^-A in the 
worst case and this bound is actually provided in the right hand side of 
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(2.33) and it is 

(2.35) 

AT 

where t is the fraction part of — . 

71—1 

The bound in (2.29) can be expressed as a function of the sparsity 
rate by using the definition 



a(.4) = 



N 



(2.33) 



■v7hich leads to the follovring equality 

-m.) + 2N] = }^n77i{c{A){n -l-2)-l). (2.37) 

It is interesting to observe the connection between the bound in (2.29) 
and the mathematical effort to accomplish vnthout using sparsity 
method vrhich is 



+ l)m. (2.38) 

The difference betiveen (2.38) and (2.29) can be established by expanding 
these two formulas achieving 

)^n(n + l)?n — -m)n 4- 2.V] = )4(n. + 2)(m.-n -N). (2,39) 

Dividing and multiplying the right hand side of (2.39) by m.n yield the fol- 
lowing expression for the difference 

+ 2)mn(l - ct(/1)) (2.40) 



where ct( 4) is the sparsity rate of A. 
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2.2 The best case 



In our discussion, we cedi the case in v/hich we need the minimum 
number of multiplication to produce A'^-A provided that there are N 
nonzero elements in A the best case . The number of operations in the 
best case can be derived by minimizing 



g 7n. {77li ^ 1) 






(2.41] 



subject to (2.8) and (2.7). Tvithout the integer restriction it is immediate 
that since the objective function is convex, the solution Trill be the arith- 

metic mean, that is, for all i , ttiv - — . The restriction that all the ttu have 



m 



to be integral yields the solution 



N 



771 




m 



i€j 



(2.42) 



where L = and jL— /!= jV- 



JL 

m 



m. Consequently, the 



number of multiplication in the best case is 



_ r. 7n*'{7n,*+l) _ 

2 j p /’A 

i = l ^ 



m 



i)-(^iV tri)- 



N 

>1 



(2.4y) 



In order to present the magnitude cf the difference between the worst 



and the best case, let us assume that —and — — ^are integers. In this 

m n — 1 



case (2.28) holds "^vith equality and 
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Subtracting /^sc from iJ^ 3 neld 



IMix -l^bc -nm, -hN - 




= }in{N-m){l-aiA)). 



If vrs take, for e:<ample, jV =;47n(n + l) the difference v/iU b; 

v/'hile /it: = ^ ■ That means that ,tu-c, for large n, 

mately 50% more than ,Uic . 



(2.44) 

(2.45) 

771 fn — 1)^ 
B 

is approxi- 



a i\ppLicATiorj 



In this section we present an example in v/hich the product -D-A is 
required where D is a diagonal matrix and the pattern of A can be 
designed in order to reduce the computational effort. Since we are dis- 
cussing the number of zeroes in matrices, let us denote by Z{A) the 
number of zero elements in the matrix A. Consider the problem, intro- 
duced b}^ Gay [l] 

(PI) min ?(i') = 2p-;(r,,(2:)) (3.1) 

i = l 

where ,pi:E^E and T7i>n. Very often r{z) = is a 

linear function of 2 ; , (see for examiple Gonen &c Avriel [3], or the least 
square problem in Gay [1]) v^hich mean 

r(x) = A-z - b. (3.2) 

In this case, the gradient and Hessicin of 93 have particularly simple forms 

^'p{x) = A'^-p'{r{z)) • (3.3) 

= A^DA (3.4) 

where 

p'(r(x)) = 0'i(ri(x)) (3.5) 

and 

D = dia5'0''i(r;(x)),...,p'V.(vri(^))] (3.6) 

is the diagonal matrix vhth diagonal elements p"i{ri{x)) . Since v/e have a 
simple analytic presentation of the gradient and Hessian , it is reasonable 
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to consider using Newton method to construct a sequence of iterates 
which, under reasonable conditions, converge to a local minimizer. This 
mean that the product A^-D-A vhll be used each iteration and very often 
this computation is the most expensive part of the algorithm. The main 
idea is to accomplish an initial preparation step by factoring 

A = BQ (3.7) 



vrhere 

(Z(5)=n). The 



is nonsingular and has zeroes in it 



next step is to substitute Qz by y in (3.7) leading to the 



problem 



(?2) min ^j(o) = j^Piiniz)) 

Z = 1 



(3.8) 



ere 



-riy) = By -b. 



n'' 



To establish the connection bstvreen the t:70 problems, let 



introduce 



the fclloT.ing Lemma: 



Lemma 3.1: A point a:* satisfies sufficient conditions for minimium of 
problem PL vrith r{x) defined by (3.2) if and onl}^ if y' = Qx satisfies 
sulTicient conditions for minimum of problem P2. 

Prccf: The sufficient conditions for minimum of problem Pi, vrhere r(a;) 
satisfies (3.2), are: 



A'^-Vcf{Ax’) = 0 



(3.10) 



- i?- 



z'^ A^ '^^(p{Ax')Az > 0 (3.11) 

for all zf^O. Since A = B Q where ^ is a nonsingular matrix (3.10) is 
equivalent to 

B^V<p{By') = 0 (3.12) 

and (3.11) can be rewritten as 

z'^ qT-B'^’V^<p(Bij’)B’Q-z >0 (3.13) 

for all 2 7^0. Since Qz = 0 if and only if 2 = 0 our proof is completed. 



It is important to mention that from Lemma 3.1 ~7e can deduce that 
if h is a nonsingular square matrix then it is enough to minimize ^p{y) and 
the m-inimizer 2 ; * will satisfy 2 :* = A~^-y* . 

In our next lemma we introduce a set of matrices such that 

for every factorization of a matrix in this set; A = BQ w'here 5 is a non- 
singular matrix, the matrix B vail have at most - n zeroes 
{Z{B) - n). Next we shovf a practical method of factorizing a full 

ranked matrix vrliich achieve at least - n zeroes: in general v/e cannot 
expect more . 



Lemma 3.2: Let where m>n be a full rank matrix. Let A - [h,-/] 

be an m by n+m matrix. If any set of m columns of A are linearly 
independent then for every factorization A = BQ vrhere is a non- 
singular matrix and the matrix B will include , at least, 



1 

- io ’ 



n(m + l)-n^ nonzero elements, (that is, Z{B) < n^-n ). 

Proof: Consider the factorization AQ~'^ = B which can be witten as n 
identical linear systems 

A-{Q-^)*j - I-B>j = 0 j = (3.14) 



The coefficients matrix fi=[h,-/] has rank m and any 7 n^m submatrhi of 



7-i has full rank. Let us denote by x the vector 



3 tj 



in First we 



claim that x has at least m-vi nonzero elements. Suppose x has less then 
m + l nonzero elements then it has at least n zero elements. Suopose 



Xi=x, = ■ ■ ■ =x, =0 and define to be a submatrix of A mth 



columns where jVi; for all According to the lemma’s assump- 

tion, C is ncnsinguJar and therefore the onl]/ solution to C y =0 is y=0 
which mean GTj''- is zero. This contradicts our assumption that Q is non- 
singular. Therefore the matrices Q and B together have at least Tim +71 
nonzero clcmients. If wc assume that all the zeroes arc in B, wc still 
remain 'vvhth 7^(m + l)-w^ nonzero elements in B . 



Comment; Any Vandermonde matrix satisfies the conditions of 
Lemma 3.3 therefore there are infinitely many e:<am.ples of m.atriees for 
v*hich one cannot expect to get more than - n zeroes in B. 

Next we introduce a practical method to factorize a full ranked 
matrix A vrtth, at least, v.^-n zero elements in 3. 



- 19- 



Tbe faclorizaLion 



Let be a full rank matrix where m>n . Then we can write 



A = 



-4i 
Az ■ 



(3.15) 



Suppose hi is nonsingular ny.n matrix. In this case w^e can take 



B - 



I 

Az-Ar^ 



Q = [A,] 



(3.16) 



and there are n^-n zeroes in B. Hovrever, this factorization is the vrorst 
case of section 1. In order to accomplish a better factorization, let as 
assume that m >2n in this case we can write the matrix as follov?’: 



A 







(3.17) 



where is a nonsing^ular matrix, and Assume 



that As'Ai ^ can be factorized into L- U where L and 
triangular matrices respectively. 

C /-1 



B = 



h2-hf* • U~^ 
L 



Q = !7'/l, 



U are lo’ver and upper 



(3.18) 



v'/ill give us a factorization with n^-n zeroes in B and its form 'vvill be 
closer to uniform distribution of the zero elements among the rows of the 
matrix. 

It is interesting to observe cases in which the matrix A is not of full 
rank. We mil show that in some cases it is possible to achieve mere zeroes 
than the full rank case and in other cases, the opposite is true. 
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Lemma 3.3: Let where ran!c{Ai) < n. A sunicient condition for A to 

have a factorization A = BQ , where is a nonsingular matrix and 

3|2ch that Z{B) > n~ - n is that 

ranlc(A) + n < tti + 1 (3.19) 



Proof: Suppose that rc.rJc{A) = k , l<k <n . Without loss of generality v/e 
may assume that the first k columns of A are linearly independent and 
the last {n-k) columns are linear combinations of the first k columns. Let 
us wTite A = [Ai.Az] where and There exists a matrix 

such that Az = Ai'B . The matrix /li can be factorized to 
Ai - Bi'Qi according to (3.16) vfhere has k^-k zero elements, and 

QieR'~"’~ a ncnsingular matrix. Let BeR”^^^ be the matrix 7dth Bi in its 
first A' columns and zeroes in its last (n-k) columns and let 



Q = 



^1 ^ 1 ' 
0 I 



(3.20) 



Since Qi is nonsingular, Q is nonsingular and A = BQ. In this case B has 
at least k^ - k + 7n{v. - k) zeroes. Recall that the number of zeroes in B in 
the full rank case is 7^^ - n , it follows that k^ - k k m{n - k) > - n iff 

- A:(m + l) + n{m + 1) - > 0 iff k^ - n~ > (m + l){k - n) . Since k < n 

the last inequality will hold iff A: + tz < m + 1. This inequality is the 
sufficient condition in (3.19). 



Conclusions 
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V/e have seen in this paper a class of optimization problems for vhiich 
the Hessian matrix can be written as A'^-D A where and a 

diagonal matrix. We showed that in several cases, the matrix .4 can be 
partially designed by the user in order to reduce the number of nonzero 
elements to a minimum. In previous sections we explored the pattern of a 
sparse matrix vfith a given number of nonzero elements. We shov/ed that 
in order to rninirnize the computational complexity of A'^-D-A vre should 
divide the nonzero elements uniformly among the rows cf A and if the 
nonzero elem.ents are confined in certain row-s then the computational 
complexity is maximized. 

The difTsrsnce betw'een the evaluation of the product A'^-A by method, 
of dense matrices and the upper bound for the ororst case using sparse 
method is presented in (2.40). It can be seen that this difference depends 

linearly on the oroportion of zero elements in the matrix w-Mch is — — 

mrt 

. Furthermore, the saving in using sparse method is, at least, )>(7i-f2)Tn,n 
times this proportion. Since + and (2,30) arc both close for large 
m and n, the saving is at least the number of operations for the dense 
case times the proportion of the zeroes elements. 

Finally we demonstrated a practical method for factorizing a full 
ranked matrix A^R^~^^ into B-Q vrhere B has at least n^-n zero ele- 
ments. Furthermore, we presented a class of matrices A for which you 
cannot expect to get more than ri^ - n zero elements. 



Unfortunately , this factorization is not optimal sines the nonzero 
elements are not distributed uniformly among the rows and this question 
is still vdthout an answer. Secondly, we proved that we can achieve a 
least n® -n zero elements in 5 if ^ is full ranked or ran!c{A) + n < m + 1 . 
Yfe did not prove anything for matrices which eire not full rank and do not 
satisfy (3.19) . The author conjecture is that the theorem may apply also 
for this case. 
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